
[fix](test) Fix wrong split count assertion in test_hive_compress_type_large_data #62360

Merged
morningman merged 1 commit into apache:master from kaka11chen:fix_test_hive_compress_type_large_data
Apr 13, 2026

@kaka11chen
Contributor

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Release note

test_hive_compress_type_large_data fails because the second explain block hardcodes inputSplitNum=16 for file_split_size=8MB, but on multi-BE clusters where parallelExecInstanceNum * backendNum > 16, count pushdown sets needSplit=true, causing files to be split by 8MB and producing 82 splits instead of 16.

The first explain block already used dynamic logic to handle this case, but the second block did not. Fix: apply the same dynamic expectedSplitNum logic to both explain blocks.
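
A minimal sketch of the dynamic expectation for the `file_split_size=8MB` explain block, written in Python for illustration only (the regression test's own code is not shown in this conversation). The function and parameter names are hypothetical; the threshold of 16 and the split counts 16 and 82 are taken from the description above.

```python
def expected_split_num_8mb(parallel_exec_instance_num: int, backend_num: int) -> int:
    """Expected inputSplitNum for file_split_size=8MB (illustrative only)."""
    # Per the PR description: count pushdown sets needSplit=true once
    # parallelExecInstanceNum * backendNum exceeds 16.
    need_split = parallel_exec_instance_num * backend_num > 16
    # 82 and 16 are the counts quoted in the description for this test's data.
    return 82 if need_split else 16
```

Under these assumptions, a single-BE cluster with 8 parallel instances would still expect 16 splits, while three BEs with 8 instances each would expect 82.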

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

…e_large_data

Problem Summary: `test_hive_compress_type_large_data` fails because the
second explain block hardcodes `inputSplitNum=16` for `file_split_size=8MB`,
but on multi-BE clusters where `parallelExecInstanceNum * backendNum > 16`,
count pushdown sets `needSplit=true`, causing files to be split by 8MB and
producing 82 splits instead of 16.

The first explain block already used dynamic logic to handle this case, but
the second block did not. Fix: apply the same dynamic expectedSplitNum logic
to both explain blocks.
Copilot AI review requested due to automatic review settings April 10, 2026 12:18
@kaka11chen
Contributor Author

run buildall

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

Contributor

Copilot AI left a comment

Pull request overview

Updates a flaky Hive regression test by making its expected split-count assertions adapt to cluster parallelism, fixing failures on multi-BE clusters where count pushdown enables file splitting.

Changes:

  • Introduces a shared needSplit condition based on parallelExecInstanceNum * backendNum to determine whether splitting is expected.
  • Applies dynamic expected split counts to both explain blocks (for file_split_size=0 and file_split_size=8MB) instead of hardcoding inputSplitNum=16.
  • Expands inline comments to document the split-count expectations for each scenario.
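
The shared-condition idea in the list above could be sketched as follows. This is an illustrative Python model, not the test's actual code: `SplitExpectation`, `need_split`, and `expected_counts` are hypothetical names, and the threshold of 16 comes from the PR description. The point is that the condition is computed once and every explain block derives its expected count from it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SplitExpectation:
    file_split_size: int  # session value under test, in bytes
    unsplit_count: int    # expected inputSplitNum when needSplit is false
    split_count: int      # expected inputSplitNum when needSplit is true


def need_split(parallel_exec_instance_num: int, backend_num: int,
               threshold: int = 16) -> bool:
    # Assumed condition, as stated in the PR description.
    return parallel_exec_instance_num * backend_num > threshold


def expected_counts(scenarios, parallel_exec_instance_num, backend_num):
    # Evaluate the shared condition once, then reuse it for every scenario.
    split = need_split(parallel_exec_instance_num, backend_num)
    return {
        s.file_split_size: (s.split_count if split else s.unsplit_count)
        for s in scenarios
    }


# Only the 8MB counts (16 and 82) are stated in this conversation.
scenarios = [SplitExpectation(8 * 1024 * 1024, unsplit_count=16, split_count=82)]
print(expected_counts(scenarios, parallel_exec_instance_num=8, backend_num=3))
```

The expected counts for `file_split_size=0` are not given in this conversation, so that scenario would need values taken from the test itself.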


yiguolei pushed a commit that referenced this pull request Apr 11, 2026
…e_large_data (#62361)

### What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

### Release note

Cherry-pick #62360 

### Check List (For Author)

- Test <!-- At least one of them must be included. -->
    - [ ] Regression test
    - [ ] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
        - [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason <!-- Add your reason?  -->

- Behavior changed:
    - [ ] No.
    - [ ] Yes. <!-- Explain the behavior change -->

- Does this need documentation?
    - [ ] No.
    - [ ] Yes. <!-- Add document PR link here. eg:
apache/doris-website#1214 -->

### Check List (For Reviewer who merge this PR)

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR
should merge into -->
@github-actions github-actions bot added the approved label Apr 13, 2026
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

@morningman morningman merged commit 01495e0 into apache:master Apr 13, 2026
39 of 40 checks passed

Labels

approved, dev/4.1.0-merged, reviewed

7 participants